Cyclades board game tracking¶

  • Karol Cyganik, 148250
  • Sebastian Chwilczyński, 148248

Repo
Data

In this notebook we will present our solution for the Cyclades board game tracking. We will show how we implemented the game logic and how we tracked objects on the board, as well as events.


This game is for 2-4 players; we simulated a game for 3 players. The board can be divided into 2 parts: the left, where every player bets money on a chosen god, and the right, where the game itself is played. The game proceeds in turns, and on each turn a player can take one of the following actions:

  • Bet money on a chosen god
  • Place a ship on the board
  • Place a warrior on the board
  • Fight with another player
  • Place cities on the board

We decided to record videos at 3 levels of difficulty. It turned out that even the first one is very hard, corresponding to level 3 in the project description: it has an angled view of the board, a slightly shaking camera, and elements that are sometimes covered by a hand. The second level is harder, as it introduces a slightly different angle and different lighting conditions (random shadows). The third one is almost impossible to process, because it has a dynamic camera view at an acute angle and random shadows. There are also a lot of objects on the board, which makes tracking even harder.

We decided to track several objects:

  • Ships (and their colors)
  • Warriors (and their colors)
  • Counters (and their colors, for betting)
  • Gods' cards
  • Circles, which are used to mark the areas we can put ships/warriors on (with their type)
  • Islands
  • Cities

We also track several events:

  • Placing ships
  • Placing warriors
  • Placing gods' cards
  • Placing counters for betting
  • Placing cities
  • Moving ships
  • Moving warriors
  • Moving counters while betting
  • Drawing a card
  • Assigning an island to a player

As gameplay status and score, we track:

  • Total number of ships
  • Total number of warriors
  • Total number of taken islands
  • Total number of cities
  • Which player has which god

Dataset¶

In [39]:
import cv2
import numpy as np
import PIL.Image
from IPython.display import display

def imshow(a):
    a = a.clip(0, 255).astype('uint8')
    if a.ndim == 3:
        if a.shape[2] == 4:
            a = cv2.cvtColor(a, cv2.COLOR_BGRA2RGBA)
        else:
            a = cv2.cvtColor(a, cv2.COLOR_BGR2RGB)
    display(PIL.Image.fromarray(a))

def read_and_concatenate(img_paths, resize=0.4):
    images = []
    for img_path in img_paths:
        images.append(cv2.resize(cv2.imread(f"report_data/{img_path}"), None, fx=resize, fy=resize))

    imshow(np.concatenate(images, axis=1))

Level 1¶

As stated above, the first level has the following conditions:

  • Angled view of the board
  • Small camera shakes
  • No random shadows; the light is constant
  • Hands covering some elements and casting a little shadow

Example of the frames from the first level of difficulty:

In [40]:
read_and_concatenate([f"level1_{x}.png" for x in range(1,4)])

Level 2¶

Here we introduced random shadows and different lighting conditions.

In [28]:
read_and_concatenate([f"level2_{x}.png" for x in range(3)])

Level 3¶

Here we introduced a dynamic camera view and random shadows. There are also a lot of objects on the board, which makes tracking even harder.

In [42]:
read_and_concatenate([f"level3_{x}.png" for x in range(1,4)])

Processing of the video¶

Before applying any tracking technique, we preprocess the video.

First, we perform CLAHE equalization on the color image to fight difficult lighting conditions (board_preparator.equalize_color_image).

In [49]:
read_and_concatenate([f"hist{x}.png" for x in range(2)], resize=0.8)

Then, to counter the shaky camera, we perform image alignment based on keypoints. We keep one "ideal" board reference to which we map every processed frame (board_preparator.alignImageToFirstFrame). This is actually the most costly operation of all.

In [52]:
read_and_concatenate([f"warp{x}.jpg" for x in range(3)], resize=0.8)

As you can see, the image was rotated a little here so that the keypoint positions correspond to one another. Still, when the camera is shaky, the view sometimes changes a lot. To minimize distortions, we occasionally reinitialize the reference frame (board_preparator.reinitialize_first_frame).

Next, we want to filter out the area to the left of the board, which does not help at all. We use HoughLines to do so. We always perform this operation on the same empty-board reference image, so it is deterministic; in practice, one could instead ask the user to manually label the region to be filtered (board_preparator.get_mask_of_left_mess). At every step we remove this noise by multiplying the frame by the filtering mask.

In [55]:
read_and_concatenate([f"left_line{x}.png" for x in range(4)], resize=1)

Next, we separate the board into 2 parts.

To do so, we apply an averaging convolution with a kernel of size 10 to the red and blue channels; after simple thresholding, we obtain a mask. We then take the first vertical line from the left where the column sum of the mask exceeds a threshold. The code of this function is in board_preparator.find_separating_line.

In [44]:
read_and_concatenate([f"sep_line{x}.png" for x in range(3)], resize=1)

Idea for detection¶

We don't really know a priori what we want to track: players can play with 4 different colors, there may be different numbers of gods, dozens of items can be on the board at the same time, items can be covered by a hand, and so on. At the same time, we wanted something that works in real time, so we used a heuristic. At every frame we perform foreground extraction; if some item has shown up in the same place a few times in a row, we check whether it really is an object. Most of the time the board is very static, so why spend a lot of resources on every frame? We perform the more costly operations only when necessary, and we do not scan the entire board looking for dozens of items (utils.update_interesting_objects). Once we detect an object, we check on subsequent iterations whether it is still there. If an object disappeared from one location and showed up at another, we record this as a move. We could have detected an object and run a tracker on it, but trackers lose pawns easily, as pawns can move literally anywhere on the board.

In [58]:
read_and_concatenate([f"tracking{x}.png" for x in range(4)], resize=1)

Here, the same object was seen three times in the same place, so we should check what it is. We analyze every 10th frame, so for one second (the first 3 analyzed frames) we didn't have to perform any logic-related operation, as nothing really happened on the board.
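The persistence heuristic behind utils.update_interesting_objects can be sketched as a small tracker of candidate centroids: a candidate must reappear near the same spot for a few consecutive analyzed frames before the more expensive classification runs. The class name, radius, and persistence count below are illustrative assumptions:

```python
class PersistenceTracker:
    """Confirm a foreground region only after it persists in roughly the same place."""

    def __init__(self, persistence=3, radius=20):
        self.persistence = persistence
        self.radius = radius
        self.candidates = []  # each entry is [x, y, consecutive_hits]

    def update(self, centroids):
        confirmed = []
        next_candidates = []
        for cx, cy in centroids:
            matched = False
            for c in self.candidates:
                if abs(c[0] - cx) < self.radius and abs(c[1] - cy) < self.radius:
                    c[:] = [cx, cy, c[2] + 1]  # same spot again: bump the counter
                    matched = True
                    if c[2] >= self.persistence:
                        confirmed.append((cx, cy))  # stable: worth classifying now
                    next_candidates.append(c)
                    break
            if not matched:
                next_candidates.append([cx, cy, 1])  # brand-new candidate
        # Candidates not seen this frame are dropped (the streak must be consecutive)
        self.candidates = next_candidates
        return confirmed
```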

What if there is something on the board from the very beginning? At the start we pass an empty frame through the background extractor 10 times. When the first real frame differs from the empty board, we see the difference and can process every object.

In [65]:
read_and_concatenate([f"immadiate{x}.png" for x in range(2)], resize=0.8)

Unfortunately, this doesn't work as well when the lighting conditions change rapidly, and we have only one video, in excellent lighting conditions, where the board is entirely empty. In real life this shouldn't be a big problem, as every game starts with nothing on the board. Alternatively, we could run detection for everything on the first frame, but this would add a plethora of code, so we stuck with this approach.

In [66]:
read_and_concatenate([f"immadiate{x}.png" for x in range(2,4)], resize=0.8)
Right part¶

Now we analyze both parts separately. At the beginning we find the circles and create a sea/land segmentation map: green circles are land and blue circles are sea. We couldn't detect all circles, which makes the analysis a little harder (utils.find_circles, right_part_analyzer.draw_circles, right_part_analyzer.label_circles).

We find circles using Hough Circles. After finding the circles on the board, we label them using heuristics based on color segmentation: we take the reference colors for land and water and compute the ratio of each within the circle to decide its label.

In [70]:
read_and_concatenate([f"circles{x}.png" for x in range(2)], resize=0.8)

Using the masks above, we can define and draw island ellipses; we can also detect who possesses an island by checking whose warriors stand on it (right_part_analyzer.detect_islands).

In [75]:
imshow(cv2.imread("report_data/island.jpg"))

Every time something new appears, we identify the object with heuristics. We tried different ones: template matching failed because it is not rotation-invariant, and keypoints failed because the objects were too small to yield any. Eventually we stuck with color filtering, which actually works pretty well. The entire object-recognition logic is in right/left_part_analyzer.

In [71]:
read_and_concatenate([f"segmentation_right{x}.png" for x in range(1,7)], resize=0.8)

Here we see the detection of a black ship and red and yellow warriors. We applied this successfully to other object types such as cities, god cards, and pawns.

Left part¶

For the left part we use the same techniques, with one exception. Since there is a lot of noise there after a god card is placed, we zero the mask in the place where we detect objects. When the object disappears from that place, the mask can be reset.

In [76]:
read_and_concatenate([f"mask_clean{x}.png" for x in range(0,4)], resize=0.8)

Displaying scores¶

We count everything and display it to the user.

In [79]:
imshow(cv2.imread("report_data/scores.png"))

Entire processing described¶

  • We iterate over every 10th frame, which gives 3 FPS.

  • For each frame we perform the steps described above: we resize it and equalize the colors in the frame.

  • Then we try to align the current frame to the first frame using previously calculated keypoints and descriptors. We use the ORB detector and descriptor, then a brute-force matcher to find the best matches. Finally, we find a homography and use it to warp the perspective.

  • Then we separate the frame using the previously calculated separating line. We then run foreground extraction, which is likewise divided into the 2 parts.

  • Foreground extraction is used to see whether some new region of interest has emerged.

  • Now the most important part takes place: we look for and update the interesting objects for both the left and right sides of the board, separately.

  • First we detect contours on the foreground shown above. We then use the heuristic that an object must persist in the foreground for 3 analyzed frames (1 s) to be classified. The classification phase simply applies the different color masks; we either classify the object or mark it as unknown. We distinguish warriors from ships by checking whether the object stands on an island or on the sea, using the previously segmented island map. We keep only objects that are not unknown and have a suitable area, and mark each with a rectangle and a text label in the matching color.

  • If the detected counter is a warrior, we assign an island to that player (using colors): we check whether the warrior's coordinates lie within the island's coordinates, and then mark the island with the warrior's color.

  • We also detect when a counter is moved. Each time we detect a ship/warrior, we iterate over all previous counters of its type and check whether each one is still there. If one isn't, that's a moved counter.

For other objects, the processing is very similar to what is described here for ships/warriors.
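The move-detection step above can be sketched as a comparison against the previous tracker state: when a new object of some type and color appears and an object with the same type and color has vanished elsewhere, we report a move rather than a placement. The tuple layout and radius are illustrative assumptions:

```python
def detect_move_sketch(new_obj, prev_objs, still_present, radius=25):
    """new_obj: (x, y, type, color). prev_objs: previously confirmed objects.
    still_present: the subset of prev_objs re-detected in the current frame."""
    for old in prev_objs:
        # Same type and color, but no longer visible at its old location?
        if old[2:] == new_obj[2:] and old not in still_present:
            dx, dy = old[0] - new_obj[0], old[1] - new_obj[1]
            if abs(dx) > radius or abs(dy) > radius:
                return ("moved", old[:2], new_obj[:2])
    # No vanished counterpart found: this is a freshly placed object
    return ("placed", None, new_obj[:2])
```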

Results on each dataset¶

(python video player with results)

Conclusion¶

  • This approach is quite efficient and can easily be applied in real time.
  • Next time, better videos should be recorded. Still, the results are very satisfying.
  • When we start with an already filled board, it is a little problematic to reconstruct all objects. A separate stage that analyzes the very first frame and detects every object on it might work better.
  • Board games are quite specific, and one can come up with many heuristics that work quite well for them.
  • The objects are usually very small, so keypoint detectors fail; they are also rotated, so template matching fails. We are left with color matching, which is, however, very effective for board games.
  • Preprocessing of the frames is crucial: without CLAHE and warping, this approach would not work at all.
  • Dividing the board into parts and processing them separately is beneficial.
  • Interaction with the user would make this task easier: first we would ask them to place the empty board in view of the camera, then to label the background, etc.
  • Decreasing the frames per second in this scenario gives a big efficiency boost and doesn't decrease performance.